

# Full Custom Physical Design of 4-Bit High Speed CSLA Using Grid Based Routing and Standard Cells

## Somshekhar.R.Puranmath<sup>1</sup>, Archana.S.Kori<sup>2</sup>

Asst. Professor, Dept. of ECE, K.L.E Institute of Technology, Hubballi, India<sup>1,2</sup>

Abstract: High-speed addition and multiplication have always been a major requirement of high-performance processing units. The speed of addition and multiplication operations depends on the speed of the adders used in the design. A ripple carry adder (RCA) has better area utilization but more delay, whereas the carry select adder (CSLA) uses two RCAs to increase the speed. The regular carry select adder consists of dual ripple carry adders and multiplexers. The carry out calculated by the previous, less significant bit stage is used to select the actual values of the output carry and sum of the next bit stage. The main disadvantages of the regular CSLA are its large area, due to multiple pairs of RCAs, and its higher power consumption. The modified CSLA using the Common Boolean Logic (CBL) structure replaces the duplicated RCA with one inverter (INV) and one OR gate. Multiplexers then select the correct output according to the logic state of the carry-in signal from the previous bit stage. This structure reduces area, delay and power. In this paper we implement two CSLA architectures, simulate them in Cadence Spectre, and compare their delay, area and power. The layout of the CSLA using CBL is implemented in the Cadence Virtuoso layout editor with a standard cell library and grid-based routing.

Keywords: Carry select adder, standard cell library, grid based routing.

#### I. **INTRODUCTION**

VLSI technology is a process that enables IC designers to combine many functions into a single chip containing billions of transistors. These ICs are used in applications such as microprocessors, memories and DSPs. In VLSI technology, the main design constraints are area, power and speed, and the main area of research is the optimization of area, power and delay. The performance of these processors usually depends on the arithmetic units used. Complex arithmetic operations can be decomposed into simple elementary operations such as addition and subtraction. Hence, the addition operation determines the overall performance of digital blocks such as multipliers and of DSPs executing algorithms such as FIR and FFT. In this paper we have designed a standard-cell-based CSLA using CBL with Cadence tools, in which area and power are reduced compared with other CSLA architectures. The schematic is entered in Composer schematic studio, and circuit simulation and logic verification are done using Cadence Spectre. The custom layout of the proposed CSLA is implemented in the Cadence Virtuoso tool, and DRC and LVS are cleared using Cadence Assura. The layout of the proposed adder uses a self-designed custom standard cell library, and grid-based routing is used to reduce area and to reduce DRC and LVS errors.

#### II. SYSTEM ARCHITECTURE

The carry-select adder generally consists of two rows of ripple carry adders that generate partial sums and carries, and multiplexers that select the output. The first adder assumes a carry input of 1 and the other assumes a carry input of 0. After the two partial results are calculated, the correct sum and carry are selected by the mux. In this paper we have designed the schematics of the CSLA using RCA and the CSLA using CBL in Cadence Composer schematic.

#### A. CSLA using RCA

Fig.2.1: Regular CSLA using RCA

Fig.2.1 shows the basic building block of a carry-select adder, where the block size is 4. It has dual ripple carry adders with 2:1 muxes. The main disadvantage of this adder is its large area, due to multiple pairs of ripple carry adders; Table 2.1 shows that the MOSFET count is 192 for this architecture. One ripple carry adder assumes a carry-in of 0, so it is built from half adders, and the other assumes a carry-in of 1; the correct adder output is selected by the previous carry. The delay is reduced compared with a ripple carry adder. The carry output from the first block is given to the mux: if the carry out is logic zero, the outputs from the half adder are selected; if the carry out is logic one, the outputs from the full adder are selected.



Table 2.1. MOSFET and gate count for the design of CSLA using RCA

| Sl. No | Block name | No. of blocks | Gate count | MOSFET count |
|--------|------------|---------------|------------|--------------|
| 1      | Half adder | 3             | 2*3=6      | 36           |
| 2      | Full adder | 4             | 5*4=20     | 120          |
| 3      | 2:1 mux    | 6             | 4*6=24     | 36           |
| Total  |            | 13            | 50         | 192          |
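The selection scheme described above can be sketched as a bit-level behavioural model. This is only an illustrative Python sketch, not the Cadence schematic: the function names are hypothetical, and the block arrangement (one full adder for bit 0, then a half adder, a full adder with carry-in tied to 1, and two muxes for each of bits 1 to 3) is an assumption inferred from the block counts in Table 2.1.

```python
def half_adder(a, b):
    """Return (sum, carry) of two bits."""
    return a ^ b, a & b


def full_adder(a, b, cin):
    """Return (sum, carry) of two bits plus a carry-in."""
    s = a ^ b ^ cin
    cout = (a & b) | (cin & (a ^ b))
    return s, cout


def mux2(sel, d0, d1):
    """2:1 multiplexer: d0 when sel is 0, d1 when sel is 1."""
    return d1 if sel else d0


def csla_rca(a_bits, b_bits, cin):
    """4-bit regular CSLA. Bit lists are LSB first. Returns (sum_bits, cout)."""
    # Bit 0 sees the real carry-in, so a single full adder is enough.
    s, carry = full_adder(a_bits[0], b_bits[0], cin)
    sums = [s]
    for a, b in zip(a_bits[1:], b_bits[1:]):
        s0, c0 = half_adder(a, b)         # result assuming carry-in = 0
        s1, c1 = full_adder(a, b, 1)      # result assuming carry-in = 1
        sums.append(mux2(carry, s0, s1))  # sum mux, driven by previous carry
        carry = mux2(carry, c0, c1)       # carry mux, driven by previous carry
    return sums, carry


def to_bits(x, n=4):
    """Integer to a list of n bits, LSB first."""
    return [(x >> i) & 1 for i in range(n)]


def from_bits(bits):
    """List of bits (LSB first) back to an integer."""
    return sum(bit << i for i, bit in enumerate(bits))
```

Under these assumptions the model uses 3 half adders, 4 full adders and 6 muxes, which agrees with the block counts in Table 2.1.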

#### B. Proposed Architecture

The proposed architecture uses common Boolean logic to eliminate the duplicated adder cells of the conventional carry select adder, as shown in fig.2.2. It reduces the transistor count to 126, as shown in Table 2.2, which reduces the area compared with the CSLA using RCA and lowers the power. Analysing the truth table of a single-bit full adder shows that the sum of the inputs with the carry signal at logic 1 is the inverse of the sum with the carry signal at logic 0. Thus only one OR gate and one inverter are required to generate the partial carry and sum signals.
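The truth-table observation can be checked exhaustively. The following Python sketch (function names are hypothetical) compares a full adder evaluated at carry-in 0 and carry-in 1 for every input pair, and confirms that the carry-in 1 sum is the inverted carry-in 0 sum and that the carry-in 1 carry equals the OR of the inputs.

```python
from itertools import product


def full_adder(a, b, cin):
    """Return (sum, carry) of two bits plus a carry-in."""
    s = a ^ b ^ cin
    cout = (a & b) | (cin & (a ^ b))
    return s, cout


def cbl_identities_hold():
    """Check the common Boolean logic identities over all input pairs."""
    for a, b in product((0, 1), repeat=2):
        s0, c0 = full_adder(a, b, 0)  # carry-in assumed 0
        s1, c1 = full_adder(a, b, 1)  # carry-in assumed 1
        if s1 != (s0 ^ 1):            # sum for cin=1 is the inverted cin=0 sum
            return False
        if c0 != (a & b):             # carry for cin=0 is the half-adder carry
            return False
        if c1 != (a | b):             # carry for cin=1 is a plain OR
            return False
    return True
```

Because all three identities hold, the second adder of each pair can be replaced by an inverter on the half-adder sum and an OR gate on the inputs.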



Fig.2.2. Proposed CSLA using CBL.

The carry output c0 from the full adder is given to the mux, which selects the correct output based on it. If c0 is at logic 0, the sum output from the half adder is selected; otherwise the inverted sum output from the half adder is selected. Based on the same carry output from the full adder, the next mux selects the next carry signal c1. The proposed architecture is area efficient and low power, while its speed is comparable to that of the regular CSLA.
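A behavioural sketch of this selection logic is given below. It is an illustration in Python rather than the authors' schematic; the function names are hypothetical, and the structure (one full adder for bit 0, then a half adder, an inverter, an OR gate and two muxes per remaining bit) is assumed from the description above and the block counts in Table 2.2.

```python
def csla_cbl(a_bits, b_bits, cin):
    """4-bit CSLA using common Boolean logic. Bit lists are LSB first."""
    a0, b0 = a_bits[0], b_bits[0]
    # Bit 0: ordinary full adder using the real carry-in.
    sums = [a0 ^ b0 ^ cin]
    carry = (a0 & b0) | (cin & (a0 ^ b0))
    for a, b in zip(a_bits[1:], b_bits[1:]):
        s0, c0 = a ^ b, a & b              # half adder: result for carry-in = 0
        s1 = s0 ^ 1                        # inverter: sum for carry-in = 1
        c1 = a | b                         # OR gate: carry for carry-in = 1
        sums.append(s1 if carry else s0)   # sum mux
        carry = c1 if carry else c0        # carry mux
    return sums, carry


def add4(a, b, cin=0):
    """Add two 4-bit integers through the CBL model; returns a 5-bit result."""
    a_bits = [(a >> i) & 1 for i in range(4)]
    b_bits = [(b >> i) & 1 for i in range(4)]
    sums, cout = csla_cbl(a_bits, b_bits, cin)
    return sum(bit << i for i, bit in enumerate(sums)) + (cout << 4)
```

For example, `add4(9, 7)` walks the carry through all four stages and returns 16.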

Table 2.2. MOSFET and gate count for the proposed design of CSLA using CBL.

| Sl. No | Block name | No. of blocks | Gate count | MOSFET count |
|--------|------------|---------------|------------|--------------|
| 1      | OR gate    | 3             | 3          | 18           |
| 2      | Half adder | 3             | 3*2=6      | 36           |
| 3      | Full adder | 1             | 5          | 30           |
| 4      | 2:1 mux    | 6             | 6*4=24     | 36           |
| 5      | NOT gate   | 3             | 3          | 6            |
|        | Total      | 16            | 41         | 126          |
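The MOSFET totals of the two architectures can be reproduced from the block counts. In the Python sketch below, the per-block transistor counts are not stated directly in the paper; they are obtained by dividing each MOSFET count in Tables 2.1 and 2.2 by the corresponding number of blocks, and the dictionary names are hypothetical.

```python
# MOSFETs per block, derived as (MOSFET count) / (No. of blocks) from the tables.
MOSFETS_PER_BLOCK = {
    "half_adder": 12,  # 36 / 3
    "full_adder": 30,  # 120 / 4
    "mux2": 6,         # 36 / 6
    "or": 6,           # 18 / 3
    "not": 2,          # 6 / 3
}

# Number of instances of each block in the two architectures.
CSLA_RCA_BLOCKS = {"half_adder": 3, "full_adder": 4, "mux2": 6}
CSLA_CBL_BLOCKS = {"or": 3, "half_adder": 3, "full_adder": 1, "mux2": 6, "not": 3}


def mosfet_total(blocks):
    """Sum the transistor count over all block instances."""
    return sum(MOSFETS_PER_BLOCK[name] * count for name, count in blocks.items())


rca_total = mosfet_total(CSLA_RCA_BLOCKS)
cbl_total = mosfet_total(CSLA_CBL_BLOCKS)
saved = rca_total - cbl_total
```

The saving comes entirely from replacing three full adders (90 MOSFETs) with three OR gates and three inverters (24 MOSFETs).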

#### III. DESIGN METHODOLOGY

#### A. Front End Design

Since the mobility of electrons is higher than the mobility of holes, the resistance of the PMOS is higher than that of the NMOS, and hence the drive strength of the PMOS is lower. The transistors are sized so that the resistances of both MOSFETs are equal: the width of the PMOS is made twice that of the NMOS, which approximately equalizes the resistance. With this PMOS width, we have designed all the basic gates and built the top-level module of the CSLA. The symbols of the modules in our design are created using Cadence schematic composer. Using these symbols, the functionality of each module is tested by creating appropriate test benches and verifying the simulation results. The simulation and result analysis are carried out using ADE with Cadence Spectre in Virtuoso. The complete flow adopted for the front end design is shown in fig.3.1.
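The sizing rule can be motivated with a first-order on-resistance model. The expression below is a textbook approximation rather than something taken from the paper: it assumes equal channel length $L$, equal oxide capacitance $C_{ox}$ and equal gate overdrive for both devices, and the mobility ratio of about 2 is the value implied by the paper's choice of width; the actual ratio is process dependent.

$$
R_{on} \propto \frac{L}{\mu\, C_{ox}\, W\, (V_{GS} - V_{T})}
\quad\Rightarrow\quad
R_{on,p} = R_{on,n} \iff \mu_p W_p = \mu_n W_n \iff \frac{W_p}{W_n} = \frac{\mu_n}{\mu_p} \approx 2
$$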



Fig.3.1. Front end design flow

#### B. Back End Design

After the simulation results are found satisfactory in the front end, the circuit is converted into a geometric representation, which is the beginning of the physical design. Our physical design flow is a full custom design using standard cells and grid-based routing.

The grid is created using the hilite layer in the Cadence Virtuoso layout tool. Using this grid, the height of the standard cell is defined and optimized. A tap cell is designed with the standard height in order to avoid latch-up.

With the same standard height, the layouts of the sub-blocks are created in Cadence Virtuoso layout and a standard cell library is built. Floor planning for the top module is carried out manually such that the aspect ratio is less than 2, the pin positions are well defined and the signal flow is taken care of. For the top module, the components are placed and the routing between them is carried out in metal. The top module is verified using the Cadence Assura tool for DRC and LVS. The flow adopted for the back end design is shown in fig.3.2.
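The aspect-ratio constraint bounds the block dimensions once the area is known. The Python sketch below is a rough illustration only: it assumes a rectangular top module, reads the aspect ratio as longer side over shorter side, and uses the 2244.72 µm² area reported in the conclusion. The function names are hypothetical, and the paper does not report the actual width and height.

```python
import math


def aspect_ratio(width_um, height_um):
    """Aspect ratio taken as longer side over shorter side (always >= 1)."""
    return max(width_um, height_um) / min(width_um, height_um)


def side_bounds(area_um2, max_aspect_ratio):
    """For a rectangle of fixed area, return (longest allowed long side,
    shortest allowed short side) in micrometres at the aspect ratio limit."""
    long_side = math.sqrt(area_um2 * max_aspect_ratio)
    short_side = math.sqrt(area_um2 / max_aspect_ratio)
    return long_side, short_side


AREA_UM2 = 2244.72  # top-module area quoted in the conclusion
long_limit, short_limit = side_bounds(AREA_UM2, 2.0)
```

With these assumptions the longer side stays below roughly 67 µm and the shorter side above roughly 33.5 µm for any floor plan that meets the constraint.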





Fig.3.2. Back end design flow.

#### IV. SIMULATION RESULTS AND ANALYSIS

The schematics of the sub-blocks and top modules of the CSLA using RCA and the CSLA using CBL are created in Cadence schematic composer, and their functionality is verified using test benches. The layouts of the sub-blocks and the top module of the CSLA using CBL are created in the Cadence Virtuoso layout editor and are DRC and LVS clean. Fig.4.1 and fig.4.2 show the top-level connections of the CSLA using RCA and using CBL.



Fig. 4.1. Top level schematic of CSLA using RCA.



Fig.4.2. Top level Schematic of CSLA using CBL.

Fig.4.3 shows the functionality of the CSLA using CBL with two 4-bit inputs, a0 to a3 and b0 to b3, and a carry input, producing the sum outputs s0 to s3 and a carry out.







Fig.4.3. Simulated output waveform of top level module of CSLA using CBL.

The layouts of the sub-modules (half adder, full adder and mux) are shown in fig.4.4, fig.4.5 and fig.4.6.





Fig.4.5. Layout of full adder.





The routing of the top module of the CSLA using CBL with different metal layers is shown in fig.4.7. Fig.4.8 and fig.4.9 show the top module with DRC and LVS clear. During the fabrication of MOSFETs, the temperature varies from point to point on the wafer and the etching process is not uniform. Hence the top-level architecture is analysed for variation in temperature and process parameters, and the worst case under which the IC will operate is identified.



Fig.4.9. CSLA using CBL with LVS clear

Process variation affects the speed of the PMOS and NMOS devices, each of which can become either slow or fast. Hence there are four process corners, as listed below:

- Slow Slow (SS)
- Slow Fast (SF)
- Fast Slow (FS)
- Fast Fast (FF)

The worst operating condition is the one at which the power and delay of a given design are highest. For our design, the worst case for power occurs at the FF corner at a temperature of $125^{\circ}$C, as shown in fig.4.10.



Fig.4.10. Variation of power with temperature and process corner.

A similar analysis of delay shows that the worst case occurs at the FS corner at a temperature of $125^{\circ}$C, as shown in fig.4.11.



Fig.4.11. Variation of delay with temperature and process corner
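The corner and temperature analysis amounts to measuring each quantity over a grid of operating points and picking the largest value. The Python sketch below only illustrates that bookkeeping: the function names are hypothetical, the measured values would come from the Spectre simulations, and the temperature list in the example is an assumption, since only $125^{\circ}$C is named in the text.

```python
from itertools import product

# The four process corners listed in the paper.
CORNERS = ("SS", "SF", "FS", "FF")


def pvt_points(temperatures_c):
    """Every (corner, temperature) pair that needs to be simulated."""
    return list(product(CORNERS, temperatures_c))


def worst_case(results):
    """Return the (corner, temperature) key with the largest measured value.

    `results` maps (corner, temperature) -> power or delay from simulation.
    """
    return max(results, key=results.get)


# Hypothetical temperature list; only 125 C appears in the paper.
points = pvt_points((-40, 27, 125))
```

Running `worst_case` once on the power measurements and once on the delay measurements gives the two worst-case operating points separately, which is why they can fall on different corners.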

Table 4.1 compares the CSLA using RCA and the CSLA using CBL in terms of power, transistor count and delay.

Table 4.1. Comparison of CSLA using RCA and CBL in terms of power, delay and area.

| Name | Power     | Area (transistor count) | Delay  |
|------|-----------|-------------------------|--------|
| RCA  | 25.024 µW | 192                     | 680 ns |
| CBL  | 20.95 µW  | 126                     | 650 ns |

This shows that the CBL architecture operates at a higher speed and consumes less power than the RCA-based CSLA: about 16% lower power, about 34% fewer transistors and about 4% lower delay.
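The relative improvement of CBL over RCA follows directly from Table 4.1. The Python sketch below recomputes it; the values are copied from the table, and the helper and dictionary names are hypothetical.

```python
def percent_reduction(baseline, improved):
    """Reduction of `improved` relative to `baseline`, in percent."""
    return 100.0 * (baseline - improved) / baseline


# (RCA value, CBL value) pairs taken from Table 4.1.
TABLE_4_1 = {
    "power_uW": (25.024, 20.95),
    "transistors": (192, 126),
    "delay_ns": (680, 650),
}

reductions = {name: percent_reduction(rca, cbl)
              for name, (rca, cbl) in TABLE_4_1.items()}
```

The largest gain is in transistor count, while the delay improvement is comparatively small.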



#### V. CONCLUSION

In this paper an approach to reduce the area, power and delay of the CSLA architecture is proposed. The CBL architecture uses fewer MOSFETs, and hence a more compact design was created. Floor planning of the top module was done manually to meet the given aspect ratio constraint, and the routing of the top module followed the grid-based methodology. The standard cell library was designed with an optimum cell height, which reduced the silicon area of the top module. Simulation results and analysis show that the CSLA using CBL is faster than the CSLA using RCA, with a delay of 650 ns. The standard-cell-based full custom VLSI design of the CSLA optimised the area, which is 2244.72 µm<sup>2</sup>.

#### REFERENCES

- [1] S. Manju, V. Sornagopal, "An Efficient SQRT Architecture of Carry Select Adder Design by Common Boolean Logic," 978-1-4673-5301-4/13, IEEE, 2013.
- [2] Khosrow Golshan, "Physical Design Essentials," Springer, 2007.
- [3] R. Jacob Baker, "CMOS Circuit Design, Layout and Simulation".
- [4] Sung-Mo Kang, "CMOS Digital Integrated Circuits," 3<sup>rd</sup> edition.
- [5] Douglas A. Pucknell, Kamran Eshraghian, "Basic VLSI Design," 3<sup>rd</sup> edition.
- [6] R. Uma, Vidya Vijayan, M. Mohanapriya, Sharon Paul, "Area, Delay and Power Comparison of Adder Topologies," International Journal of VLSI Design & Communication Systems (VLSICS), Vol.3, No.1, February 2012.

#### BIOGRAPHIES



**Somshekhar R Puranmath** was born in Dandeli, Karnataka. He completed his B.Tech in telecommunication from Bangalore Institute of Technology, VTU, Bangalore, and his M.Tech in VLSI Design and Testing from B.V.B.C.E.T, Hubballi. He joined the department of electronics and communication at K.L.E Institute of Technology as a lecturer in 2012 and is currently working as Assistant Professor at K.L.E Institute of Technology. His areas of research include analog and mixed-mode physical VLSI design, image processing on FPGA, voltage reference circuits and low power CMOS VLSI design.



**Archana Kori** was born in Hospet, Karnataka. She completed her B.Tech in electronics and communication from K.L.E Institute of Technology, VTU, Hubballi, and her M.Tech in VLSI and embedded systems from Ballari Institute of Technology and Management, VTU, Ballari. She is currently working as Assistant Professor at K.L.E Institute of Technology. Her areas of research include digital VLSI circuit design and analog VLSI design.